Flexible Approximate String Matching Application on a Heterogeneous Distributed Environment

Authors

  • Panagiotis D. Michailidis
  • Konstantinos G. Margaritis
Abstract

In this paper, we present three parallel flexible approximate string matching methods on a parallel architecture of heterogeneous workstations, with the aim of gaining supercomputer power at low cost. The first method is the static master-worker with a uniform distribution strategy, the second is the dynamic master-worker with allocation of subtexts, and the third is the dynamic master-worker with allocation of text pointers. Further, we propose a hybrid parallel method that combines the advantages of the static and dynamic parallel methods in order to reduce load imbalance and communication overhead. This hybrid method is based on the following optimal distribution strategy: the text collection is distributed in proportion to each workstation's speed. We evaluated the performance of the four methods on clusters of 1, 2, 4, 6 and 8 heterogeneous workstations. The experimental results demonstrate that the dynamic allocation of text pointers and the hybrid method achieve better performance than the other two methods.
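The proportional distribution strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `proportional_chunks`, the representation of workstation speed as a list of relative numbers, and the largest-remainder rounding are all assumptions made for illustration.

```python
def proportional_chunks(text_len, speeds):
    """Split text_len characters into contiguous (start, end) ranges,
    one per worker, sized in proportion to each worker's relative speed.

    `speeds` is a hypothetical list of relative speeds (e.g. from a benchmark).
    """
    total = sum(speeds)
    # Ideal (fractional) share of the text for each worker.
    ideal = [text_len * s / total for s in speeds]
    sizes = [int(x) for x in ideal]
    # Hand out the characters lost to rounding down, largest remainder first.
    leftover = text_len - sum(sizes)
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - sizes[i],
                   reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    # Turn the sizes into half-open index ranges over the text.
    ranges, start = [], 0
    for size in sizes:
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Three workers; the last two are assumed to be twice as fast as the first.
    print(proportional_chunks(1000, [1, 2, 2]))
```

A real search would also need neighbouring ranges to overlap by roughly the pattern length so that matches spanning a boundary are not lost; the sketch leaves that out to keep the distribution step visible.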

Similar resources

Generalized Performance Model for Flexible Approximate String Matching on a Distributed System

This paper proposes a generalized and practical parallel algorithm for flexible approximate string matching that runs on several kinds of clusters, both homogeneous and heterogeneous. The algorithm is based on the master-worker paradigm and implements different partitioning schemes, such as static and dynamic load balancing, cooperating with different data ...

A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment

In this paper, we present three parallel approximate string matching methods on a parallel architecture with heterogeneous workstations to gain supercomputer power at low cost. The first method is the static master-worker with uniform distribution strategy, the second one is the dynamic master-worker with allocation of subtexts and the third one is the dynamic master-worker with allocation of t...

Performance evaluation of load balancing strategies for approximate string matching application on an MPI cluster of heterogeneous workstations

In this paper, we present three parallel approximate string matching methods on a parallel architecture with heterogeneous workstations to gain supercomputer power at low cost. The first method is the static master–worker with uniform distribution strategy, the second one is the dynamic master–worker with allocation of subtexts and the third one is the dynamic master–worker with allocation of t...

Parallel Architecture for Flexible Approximate Text Searching

This paper presents a processor array design for flexible approximate string matching. Initially, a sequential algorithm is discussed which consists of two phases, i.e. preprocessing and searching. Then, starting from the computational schedule of the searching phase, a parallel architecture is derived. Further, the preprocessing phase is also accommodated onto the same architecture. Key-Words: A...

A study on company name matching for database integration

In this report we describe an activity of information integration performed on databases with patent data and company indicators. Depending on the application area, this kind of activity is known as record linkage, duplicate detection, record matching, reference reconciliation or other domain-specific terms. In particular, we present a detailed case study on company name matching. We show how t...


Publication date: 2003